Super-Sampling with a Reservoir
Authors
Abstract
We introduce an alternative to reservoir sampling, a classic and popular algorithm for drawing a fixed-size subsample from streaming data in a single pass. Rather than drawing a random sample, our approach performs an online optimization that aims to select the subset providing the best overall approximation to the full data set, as judged by a kernel two-sample test. This produces subsets that minimize the worst-case relative error when computing expectations of functions in a specified function class, using just the samples from the subset. Kernel functions are approximated using random Fourier features, and the subset of samples itself is stored in a random projection tree. The resulting algorithm runs in a single pass through the whole data set and has a per-iteration computational complexity logarithmic in the size of the subset. These “supersamples” subsampled from the full data provide a concise summary, as demonstrated empirically on mixture models and the MNIST data set.
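As a point of reference for the abstract, the sketch below shows the two ingredients it names: the classic reservoir sampling baseline (Algorithm R) and a random-Fourier-feature estimate of the squared kernel two-sample statistic (MMD) that could be used to score how well a subset matches the full data. It is not the paper's online optimization and does not use a random projection tree. The function names, the Gaussian kernel, and the `bandwidth` parameter are assumptions made for illustration only.

```python
import math
import random


def reservoir_sample(stream, k, rng):
    """Classic reservoir sampling (Algorithm R): a uniform k-subset in one pass."""
    reservoir = []
    for i, x in enumerate(stream):
        if i < k:
            reservoir.append(x)          # fill the reservoir first
        else:
            j = rng.randint(0, i)        # inclusive on both ends
            if j < k:
                reservoir[j] = x         # replace with probability k / (i + 1)
    return reservoir


def make_rff(dim, n_features, bandwidth, rng):
    """Random Fourier feature map approximating a Gaussian kernel (assumed kernel)."""
    # Frequencies are drawn from the kernel's spectral density, N(0, 1 / bandwidth^2).
    ws = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
          for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return phi


def mean_embedding(points, phi):
    """Average feature vector of a point set."""
    feats = [phi(p) for p in points]
    return [sum(col) / len(feats) for col in zip(*feats)]


def mmd2(xs, ys, phi):
    """Squared distance between mean embeddings: an approximate squared MMD."""
    mx, my = mean_embedding(xs, phi), mean_embedding(ys, phi)
    return sum((a - b) ** 2 for a, b in zip(mx, my))
```

For example, one could draw `sample = reservoir_sample(iter(data), 20, random.Random(0))` and compare `mmd2(data, sample, phi)` across candidate subsets; a smaller value indicates a subset whose feature averages are closer to those of the full data.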
Similar resources
Super operator Technique in Investigation of the Dynamics of a Two Non-Interacting Qubit System Coupled to a Thermal Reservoir
In this paper, we clarify the applicability of the super operator technique for describing the dissipative quantum dynamics of a system consisting of two qubits coupled with a thermal bath at finite temperature. Using the super operator technique, we solve the master equation and find the matrix elements of the density operator. Considering the qubits to be initially prepared in a general mixed st...
Shiga toxin producing Escherichia coli: identification of non-O157:H7-Super-Shedding cows and related risk factors
BACKGROUND Shiga toxin producing Escherichia coli (STEC) are an important cause of human gastro-enteritis and extraintestinal sequelae, with ruminants, especially cattle, as the major source of infection and reservoir. In this study, the fecal STEC shedding of 133 dairy cows was analyzed over a period of twelve months by monthly sampling with the aim to investigate shedding patterns and risk fa...
Feasibility Study of Network Hydraulic Fracture Applied to the Fissured Competent Sand Oil Reservoir
The Chang 8 oil deposit, developed in the Hohe and Jihe oil fields at the southern Yi-Shan Slope of the Ordos Basin, is regarded as a typical sand reservoir formation with super-low porosity, poor permeability, strong anisotropy, and local natural faults and fractures. Previous studies held that the matrix reservoir has a good permeability, whereas fracture reservoir has a reverse manner...
Effects of various super absorbent concentrations on runoff volume in slopes and various intensity of simulated rainfall in Shahrekord plain
To study the effect of super absorbent on runoff volume on slopes under various rainfall intensities, research was carried out according to a split-factorial block design with one main treatment and two accessory treatments in three replicates. The main treatment consisted of three dominant slopes (10, 20, and 30 percent), and the accessory treatments consisted of five levels of substance su...
A Case History on Integrated Fracture Modeling in a Giant Field
In this paper, a case study is used to demonstrate a straightforward methodology for integrating faults, fractures, and high-permeability layers into a single porosity single permeability (SPSP) reservoir simulation model. The application of this method in the Ghawar Arab-D reservoir indicated an adequate modeling of the water encroachment pattern. The described methodology starts with the identific...